1 Outline

Today's lecture covers three main parts: 

• Courant-Fischer formula and Rayleigh quotients

• The connection of $\lambda_2$ to graph cutting

• Cheeger's Inequality 

2 Courant-Fischer and Rayleigh Quotients 

The Courant-Fischer theorem gives a variational formulation of the eigenvalues of a symmetric matrix, which
can be useful for obtaining bounds on the eigenvalues.

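Theorem 1 (Courant-Fischer Formula) Let $A$ be an $n \times n$ symmetric matrix with eigenvalues $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ and corresponding eigenvectors $v_1, \dots, v_n$. Then

$$\lambda_1 = \min_{\|x\|=1} x^T A x = \min_{x \neq 0} \frac{x^T A x}{x^T x},$$

$$\lambda_2 = \min_{\substack{\|x\|=1 \\ x \perp v_1}} x^T A x = \min_{\substack{x \neq 0 \\ x \perp v_1}} \frac{x^T A x}{x^T x},$$

$$\lambda_n = \lambda_{\max} = \max_{\|x\|=1} x^T A x = \max_{x \neq 0} \frac{x^T A x}{x^T x}.$$

In general, for $1 \le k \le n$, let $S_k$ denote the span of $v_1, \dots, v_k$ (with $S_0 = \{0\}$), and let $S_k^\perp$ denote the orthogonal complement of $S_k$. Then

$$\lambda_k = \min_{\substack{\|x\|=1 \\ x \in S_{k-1}^\perp}} x^T A x = \min_{\substack{x \neq 0 \\ x \in S_{k-1}^\perp}} \frac{x^T A x}{x^T x}.$$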

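Proof Let $A = Q^T \Lambda Q$ be the eigendecomposition of $A$. We observe that $x^T A x = x^T Q^T \Lambda Q x = (Qx)^T \Lambda (Qx)$, and since $Q$ is orthogonal, $\|Qx\| = \|x\|$. Thus it suffices to consider the case when $A = \Lambda$ is a diagonal matrix with the eigenvalues $\lambda_1, \dots, \lambda_n$ on the diagonal. Then we can write

$$x^T A x = (x_1 \ \cdots \ x_n) \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \sum_{i=1}^n \lambda_i x_i^2.$$

We note that when $A$ is diagonal, the eigenvectors of $A$ are $v_k = e_k$, the standard basis vectors in $\mathbb{R}^n$, i.e. $(e_k)_i = 1$ if $i = k$, and $(e_k)_i = 0$ otherwise. Then the condition $x \in S_{k-1}^\perp$ implies $x \perp e_i$ for $i = 1, \dots, k-1$, so $x_i = \langle x, e_i \rangle = 0$. Therefore, for $x \in S_{k-1}^\perp$ with $\|x\| = 1$, we have

$$x^T A x = \sum_{i=1}^n \lambda_i x_i^2 = \sum_{i=k}^n \lambda_i x_i^2 \ge \lambda_k \sum_{i=k}^n x_i^2 = \lambda_k \|x\|^2 = \lambda_k.$$

On the other hand, plugging in $x = e_k \in S_{k-1}^\perp$ yields $x^T A x = (e_k)^T A e_k = \lambda_k$. This shows that

$$\lambda_k = \min_{\substack{\|x\|=1 \\ x \in S_{k-1}^\perp}} x^T A x.$$

Similarly, for $\|x\| = 1$,

$$x^T A x = \sum_{i=1}^n \lambda_i x_i^2 \le \lambda_{\max} \sum_{i=1}^n x_i^2 = \lambda_{\max}.$$

On the other hand, taking $x = e_n$ yields $x^T A x = (e_n)^T A e_n = \lambda_{\max}$. Hence we conclude that

$$\lambda_{\max} = \max_{\|x\|=1} x^T A x.$$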



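The Rayleigh quotient is the application of the Courant-Fischer Formula to the Laplacian of a graph.

Corollary 2 (Rayleigh Quotient) Let $G = (V, E)$ be a graph and $L$ be the Laplacian of $G$. We already know that the smallest eigenvalue is $\lambda_1 = 0$ with eigenvector $v_1 = \mathbf{1}$. By the Courant-Fischer Formula,

$$\lambda_2 = \min_{\substack{x \neq 0 \\ x \perp \mathbf{1}}} \frac{x^T L x}{x^T x} = \min_{\substack{x \neq 0 \\ x \perp \mathbf{1}}} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i \in V} x_i^2}, \qquad \lambda_{\max} = \max_{x \neq 0} \frac{x^T L x}{x^T x} = \max_{x \neq 0} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i \in V} x_i^2}.$$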

We can interpret the formula for $\lambda_2$ as putting springs on each edge (with slightly weird boundary
conditions corresponding to normalization) and minimizing the potential energy of the configuration.

Some big matrices are hard or annoying to diagonalize, so in some cases, we may not want to calculate
the exact value of $\lambda_2$. However, we can still get an upper bound by just constructing a vector $x \perp \mathbf{1}$ that has
a small Rayleigh quotient. Similarly, we can find a lower bound on $\lambda_{\max}$ by constructing a vector that has
a large Rayleigh quotient. We will look at two examples in which we bound $\lambda_2$.
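As a minimal sketch of how such a bound is evaluated, the helper below computes the Rayleigh quotient of a test vector directly from an edge list. The function name `rayleigh_quotient`, the 0-based vertex labels, and the restriction to integer entries (so that exact fractions can be used) are assumptions made for illustration.

```python
from fractions import Fraction

def rayleigh_quotient(edges, x):
    """Return sum over edges of (x_i - x_j)^2 divided by sum of x_i^2.

    Assumes integer entries in x so the result is an exact Fraction.
    """
    numerator = sum((x[i] - x[j]) ** 2 for i, j in edges)
    denominator = sum(xi ** 2 for xi in x)
    if denominator == 0:
        raise ValueError("x must be nonzero")
    return Fraction(numerator, denominator)

# Triangle graph K_3 with a test vector orthogonal to the all-ones vector.
triangle = [(0, 1), (1, 2), (0, 2)]
x = [1, -1, 0]
assert sum(x) == 0
quotient = rayleigh_quotient(triangle, x)  # an upper bound on lambda_2
```

Any nonzero vector with entries summing to zero gives an upper bound on $\lambda_2$ this way, and any nonzero vector at all gives a lower bound on $\lambda_{\max}$.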

2.1 Example 1: The Path Graph 

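Let $P_{n+1}$ be the path graph on $n+1$ vertices. Label the vertices as $0, 1, \dots, n$ from one end of the path to the
other. Consider the vector $x \in \mathbb{R}^{n+1}$ given by $x_i = 2i - n$ for vertices $i = 0, 1, \dots, n$. Note that $\sum_{i=0}^n x_i = 0$,
so $x \perp \mathbf{1}$. Calculating the Rayleigh quotient for $x$ gives us

$$\frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i=0}^n x_i^2} = \frac{\sum_{i=0}^{n-1} 2^2}{\sum_{i=0}^n (2i - n)^2} = \frac{4n}{n(n+1)(n+2)/3} = \frac{12}{(n+1)(n+2)} = O(1/n^2).$$

Thus we can bound $\lambda_2 \le O(1/n^2)$. We knew this was true from the explicit formula for $\lambda_2$ in terms of sines
and cosines from Lecture 2, but this argument is much cleaner and more general.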
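The bound can be checked numerically. The sketch below evaluates the Rayleigh quotient of $x_i = 2i - n$ exactly and compares it with the standard closed form $\lambda_2 = 2 - 2\cos(\pi/(n+1))$ for the path on $n+1$ vertices, which is assumed here to be the formula from Lecture 2. The function names are hypothetical.

```python
import math
from fractions import Fraction

def path_rayleigh_quotient(n):
    """Rayleigh quotient of x_i = 2i - n on the path with vertices 0..n."""
    x = [2 * i - n for i in range(n + 1)]
    assert sum(x) == 0  # x is orthogonal to the all-ones vector
    numerator = sum((x[i] - x[i + 1]) ** 2 for i in range(n))
    denominator = sum(xi ** 2 for xi in x)
    return Fraction(numerator, denominator)

def path_lambda2(n):
    """Closed form for lambda_2 of the path on n + 1 vertices (assumed)."""
    return 2 - 2 * math.cos(math.pi / (n + 1))

for n in (2, 10, 100):
    q = path_rayleigh_quotient(n)
    # The quotient simplifies to 12 / ((n + 1)(n + 2)).
    assert q == Fraction(12, (n + 1) * (n + 2))
    # It upper-bounds lambda_2 (small tolerance for floating point).
    assert path_lambda2(n) <= float(q) + 1e-12
```

For large $n$ the closed form behaves like $\pi^2/n^2$ while the test vector gives about $12/n^2$, so the simple linear vector is off by only a constant factor.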

2.2 Example 2: A Complete Binary Tree 

Let $G$ be a complete binary tree on $n = 2^h - 1$ nodes. Define the vector $x \in \mathbb{R}^n$ to have the value $0$ on the
root node, $-1$ on all nodes in the left subtree of the root, and $1$ on all nodes in the right subtree of the root.
It is easy to see that $\sum_i x_i = 0$, since there are equal numbers of nodes in the left and right subtrees of
the root, so $x \perp \mathbf{1}$. Only the two edges incident to the root have endpoints with different values, and every node except the root contributes $1$ to the denominator, so calculating the Rayleigh quotient of $x$ gives us

$$\frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i} x_i^2} = \frac{2}{n - 1} = O(1/n).$$

Thus we get $\lambda_2 \le O(1/n)$, again with little effort. It turns out in this case that our approximation is correct
within a constant factor, and we did not even need to diagonalize a big matrix.
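The same computation can be sketched in code. The heap-style numbering below (node $i$ has children $2i$ and $2i+1$) is an indexing assumption made for convenience, not part of the lecture, and `binary_tree_quotient` is a hypothetical helper name.

```python
from fractions import Fraction

def binary_tree_quotient(h):
    """Rayleigh quotient of the root-split vector on a complete binary tree.

    Nodes are numbered 1..n in heap order (an indexing assumption):
    node i has children 2i and 2i + 1, and n = 2**h - 1. Requires h >= 2.
    """
    n = 2 ** h - 1
    edges = [(i // 2, i) for i in range(2, n + 1)]

    def side(i):
        # Walk up until reaching the root (1) or one of its children (2, 3).
        while i > 3:
            i //= 2
        return {1: 0, 2: -1, 3: 1}[i]

    x = {i: side(i) for i in range(1, n + 1)}
    assert sum(x.values()) == 0  # x is orthogonal to the all-ones vector
    numerator = sum((x[i] - x[j]) ** 2 for i, j in edges)
    denominator = sum(v ** 2 for v in x.values())
    return Fraction(numerator, denominator)
```

For every height the result equals $2/(n-1)$, matching the calculation above.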



3 Graph Cutting 

The basic problem of graph cutting is to cut a given graph G into two pieces such that both are "pretty 
big". Graph cutting has many applications in computer science and computing, e.g. for parallel processing, 
divide-and-conquer algorithms, or clustering. In each application, we want to divide the problem into smaller 
pieces so as to optimize some measure of efficiency, depending on the specific problems. 

3.1 How Do We Cut Graphs? 

The first question to ask about graph cutting is what we want to optimize when we are cutting a graph. 
Before attempting to answer this question, we introduce some notation. Let $G = (V, E)$ be a graph. Given
a set $S \subseteq V$ of vertices of $G$, let $\bar{S} = V \setminus S$ be the complement of $S$ in $V$. Let $|S|$ and $|\bar{S}|$ denote the number
of vertices in $S$ and $\bar{S}$, respectively. Finally, let $e(S)$ denote the number of edges between $S$ and $\bar{S}$. Note
that $e(S) = e(\bar{S})$.

Now we consider some possible answers to our earlier question. 
Attempt 1: Min-cut. Divide the vertex set $V$ into two parts $S$ and $\bar{S}$ to minimize $e(S)$. This approach
is motivated by the intuition that to get a good cut, we do not want to break too many edges. However, 
this approach alone is not sufficient, as Figure 1(a) demonstrates. In this example, we ideally want to cut 
the graph across the two edges in the middle, but the min-cut criterion would result in a cut across the one 
edge on the right. 

Attempt 2: Approximate bisection. Divide the vertex set $V$ into two parts $S$ and $\bar{S}$, such that $|S|$ and
$|\bar{S}|$ are approximately $n/2$ (or at least $n/3$). This criterion would take care of the problem mentioned in
Figure 1(a), but it is also not free of problems, as Figure 1(b) shows. In this example, we ideally want to 
cut the graph across the one edge in the middle that separates the two clusters. However, the approximate 
bisection criterion would force us to make a cut across the dense graph on the left. 





(a) Problem with min-cut (b) Problem with approximate bisection 

Figure 1: Illustration for problems with the proposed graph cutting criteria. 

Now we propose a criterion for graph cutting that balances the two approaches above. 
Definition 3 (Cut Ratio) The cut ratio $\phi$ of a cut $(S, \bar{S})$ is given by

$$\phi(S) = \frac{e(S)}{\min(|S|, |\bar{S}|)}.$$

The cut of minimum ratio is the cut that minimizes $\phi(S)$. The isoperimetric number of a graph $G$ is
the value of the minimum cut,

$$\phi(G) = \min_{S \subset V} \phi(S).$$

As we can see from the definition above, the cut ratio is trying to minimize the number of edges across the 
cut, while penalizing cuts with small number of vertices. This criterion turns out to be a good one, and is 
widely used for graph cutting in practice. 
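To make the definition concrete, here is a small brute-force sketch that evaluates the cut ratio and finds the isoperimetric number by trying every subset. It takes exponential time and is only meant for tiny graphs. The function names and the 0-based vertex labels are assumptions for illustration.

```python
from fractions import Fraction
from itertools import combinations

def cut_ratio(n, edges, S):
    """phi(S) = e(S) / min(|S|, |complement of S|) for vertices 0..n-1."""
    S = set(S)
    crossing = sum(1 for i, j in edges if (i in S) != (j in S))
    return Fraction(crossing, min(len(S), n - len(S)))

def isoperimetric_number(n, edges):
    """Brute-force minimum of cut_ratio; returns (ratio, minimizing set).

    Since phi(S) equals phi of the complement, only |S| <= n/2 is tried.
    """
    best = None
    for size in range(1, n // 2 + 1):
        for S in combinations(range(n), size):
            ratio = cut_ratio(n, edges, S)
            if best is None or ratio < best[0]:
                best = (ratio, set(S))
    return best

# Path on 4 vertices: the best cut removes the middle edge.
path4 = [(0, 1), (1, 2), (2, 3)]
best_ratio, best_set = isoperimetric_number(4, path4)
```

On the path with four vertices, cutting a single leaf has ratio $1$ while the middle cut has ratio $1/2$, which shows how the denominator penalizes unbalanced cuts.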

3.2 An Integer Program for the Cut Ratio 

Now that we have a good definition of graph cutting, the question is how to find the optimal cut in a
reasonable time. It turns out that we can cast the problem of finding the cut of minimum ratio as an integer
program as follows.

Associate every cut $(S, \bar{S})$ with a vector $x \in \{-1, 1\}^n$, where

$$x_i = \begin{cases} 1, & \text{if } i \in S, \text{ and} \\ -1, & \text{if } i \in \bar{S}. \end{cases}$$

Then it is easy to see that we can write

$$e(S) = \frac{1}{4} \sum_{(i,j) \in E} (x_i - x_j)^2.$$

For a boolean statement $A$, let $[A]$ denote the characteristic function on $A$, so $[A] = 1$ if $A$ is true, and
$[A] = 0$ if $A$ is false. Then we also have

$$|S| \cdot |\bar{S}| = \Big( \sum_{i \in V} [i \in S] \Big) \Big( \sum_{j \in V} [j \in \bar{S}] \Big) = \sum_{i,j \in V} [i \in S, j \in \bar{S}] = \frac{1}{2} \sum_{i,j \in V} [x_i \neq x_j] = \frac{1}{4} \sum_{i<j} (x_i - x_j)^2.$$

Combining the two computations above,

$$\min_{x \in \{-1,1\}^n} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i<j} (x_i - x_j)^2} = \min_{S \subset V} \frac{e(S)}{|S| \cdot |\bar{S}|}.$$

Now note that if $|V| = |S| + |\bar{S}| = n$, then

$$\frac{n}{2} \min(|S|, |\bar{S}|) \le |S| \cdot |\bar{S}| \le n \min(|S|, |\bar{S}|),$$

so we get

$$\frac{1}{n} \phi(G) = \min_{S \subset V} \frac{e(S)}{n \min(|S|, |\bar{S}|)} \le \min_{x \in \{-1,1\}^n} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i<j} (x_i - x_j)^2} \le \min_{S \subset V} \frac{2 e(S)}{n \min(|S|, |\bar{S}|)} = \frac{2}{n} \phi(G).$$

Therefore, solving the integer program

$$\min_{x \in \{-1,1\}^n} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i<j} (x_i - x_j)^2}$$

allows us to approximate $\phi(G)$ within a factor of 2. The bad news is that it is NP-hard to solve this program.
However, if we remove the $x \in \{-1, 1\}^n$ constraint, we can actually solve the program. Note that removing
the constraint $x \in \{-1, 1\}^n$ is actually the same as saying that $x \in [-1, 1]^n$, since we can scale $x$ without
changing the value of the objective function.
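The two identities behind the integer program are easy to sanity-check on a small graph. The sketch below builds the $\pm 1$ vector for a given set and recovers $e(S)$ and $|S| \cdot |\bar{S}|$ from the two sums of squared differences. The helper name `cut_quantities_from_vector` and the 0-based labels are assumptions for illustration.

```python
from itertools import combinations

def cut_quantities_from_vector(n, edges, S):
    """Return (e(S), |S| * |complement|) computed from the +/-1 vector x."""
    x = [1 if i in S else -1 for i in range(n)]
    edge_sum = sum((x[i] - x[j]) ** 2 for i, j in edges)
    pair_sum = sum((x[i] - x[j]) ** 2 for i, j in combinations(range(n), 2))
    # Each term is 4 when x_i != x_j and 0 otherwise, so both sums divide by 4.
    return edge_sum // 4, pair_sum // 4

# 4-cycle with S = {0, 1}: two edges cross the cut, and |S| * |S-bar| = 4.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
e_S, product = cut_quantities_from_vector(4, cycle4, {0, 1})

# Sandwich used in the text: (n/2) min <= |S||S-bar| <= n min, with min = 2.
assert 4 / 2 * 2 <= product <= 4 * 2
```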
3.3 Interlude on Relaxations



The idea of dropping the constraint $x \in \{-1, 1\}^n$ mentioned in the previous section is actually a recurring
technique in algorithms, so it is worthwhile to give a more general explanation of this relaxation technique.
A common setup in approximation algorithms is as follows: we want to solve an NP-hard problem which
takes the form of minimizing $f(x)$ subject to the constraint $x \in C$. Instead, we minimize $f(x)$ subject to a
weaker constraint $x \in C' \supseteq C$ (see Figure 2 for an illustration). Let $p$ and $q$ be the points that minimize $f$
in $C$ and $C'$, respectively. Since $C \subseteq C'$, we know that $f(q) \le f(p)$.

Figure 2: Illustration of the relaxation technique for approximation algorithms.

For this relaxation to be useful, we have to show how to "round" $q$ to a feasible point $q' \in C$, and prove
$f(q') \le \gamma f(q)$ for some constant $\gamma > 1$. This implies $f(q') \le \gamma f(q) \le \gamma f(p)$, so this process gives us a
$\gamma$-approximation.

3.4 Solving the Relaxed Program 

Going back to our integer program to find the cut of minimum ratio, now consider the following relaxed
program,

$$\min_{x \in \mathbb{R}^n} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i<j} (x_i - x_j)^2}.$$

Since the value of the objective function only depends on the differences $x_i - x_j$, we can translate $x \in \mathbb{R}^n$
such that $x \perp \mathbf{1}$, i.e. $\sum_{i=1}^n x_i = 0$.
Then observe that

$$\sum_{i<j} (x_i - x_j)^2 = n \sum_{i=1}^n x_i^2,$$

which can be obtained either by expanding the summation directly, or by noting that $x$ is an eigenvector of
the Laplacian of the complete graph $K_n$ with eigenvalue $n$ (as we saw in Lecture 2). Therefore, using the
Rayleigh quotient,

$$\min_{x \in \mathbb{R}^n} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i<j} (x_i - x_j)^2} = \min_{\substack{x \in \mathbb{R}^n \\ x \perp \mathbf{1}}} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{n \sum_{i=1}^n x_i^2} = \frac{\lambda_2}{n}.$$
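For readers who prefer the direct route, the expansion of the sum over pairs can be written out as follows. It uses only the assumption $\sum_i x_i = 0$; the intermediate steps are a reconstruction, not taken from the lecture.

```latex
\begin{align*}
\sum_{i<j} (x_i - x_j)^2
  &= \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2 \\
  &= \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( x_i^2 - 2 x_i x_j + x_j^2 \right) \\
  &= n \sum_{i=1}^{n} x_i^2 - \Big( \sum_{i=1}^{n} x_i \Big)^2 \\
  &= n \sum_{i=1}^{n} x_i^2 .
\end{align*}
```

The first equality holds because the diagonal terms $i = j$ vanish and each unordered pair is counted twice in the double sum.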
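Putting all the pieces together, we get

$$\begin{aligned}
\phi(G) &= \min_{S \subset V} \frac{e(S)}{\min(|S|, |\bar{S}|)} \\
&\ge \frac{n}{2} \min_{S \subset V} \frac{e(S)}{|S| \cdot |\bar{S}|} \\
&= \frac{n}{2} \min_{x \in \{-1,1\}^n} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i<j} (x_i - x_j)^2} \\
&\ge \frac{n}{2} \min_{x \in \mathbb{R}^n} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i<j} (x_i - x_j)^2} \\
&= \frac{n}{2} \min_{\substack{x \in \mathbb{R}^n \\ x \perp \mathbf{1}}} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{n \sum_{i=1}^n x_i^2} \\
&= \frac{\lambda_2}{2}.
\end{aligned}$$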



4 Cheeger's Inequality 

In the previous section, we obtained the bound $\phi(G) \ge \lambda_2/2$, but what about the other direction? For that,
we would need a rounding method, which is a way of getting a cut from $\lambda_2$ and $v_2$, and an upper bound on
how much the rounding increases the cut ratio that we are trying to minimize. In the next section, we will
see how to construct a cut from $\lambda_2$ and $v_2$ that gives us the following bound, which is Cheeger's Inequality.

Theorem 4 (Cheeger's Inequality) Given a graph $G$,

$$\frac{\phi(G)^2}{2 d_{\max}} \le \lambda_2 \le 2 \phi(G),$$

where $d_{\max}$ is the maximum degree in $G$.

As a side note, the $d_{\max}$ disappears from the formula if we use the normalized Laplacian in our calculations,
but the proof is messier and is not fundamentally any different from the proof using the regular
Laplacian.

The lower bound of $\phi(G)^2/(2 d_{\max})$ in Cheeger's Inequality is the best we can do to bound $\lambda_2$. The square
factor $\phi(G)^2$ is unfortunate, but if it were within a constant factor of $\phi(G)$, we would be able to find a constant
approximation of an NP-hard problem. Also, if we look at the examples of the path graph and the complete
binary tree, their isoperimetric numbers are the same, since we can cut exactly one edge in the middle of the
graph and divide the graph into two asymptotically equal-sized pieces for a value of $O(1/n)$. However, the
two graphs have different upper bounds for $\lambda_2$, $O(1/n^2)$ and $O(1/n)$ respectively, which demonstrates that
both the lower and upper bounds on $\lambda_2$ in Cheeger's inequality are tight (to a constant factor).



4.1 How to Get a Cut from $v_2$ and $\lambda_2$

Let $x \in \mathbb{R}^n$ such that $x \perp \mathbf{1}$. We will use $x$ as a map from the vertices $V$ to $\mathbb{R}$. Cutting $\mathbb{R}$ would thus give a
partition of $V$ as follows: order the vertices such that $x_1 \le x_2 \le \dots \le x_n$, and the cut will be defined by the
set $S = \{1, \dots, k\}$ for some value of $k$. The value of $k$ cannot be known a priori since the best cut depends
on the graph. In practice, an algorithm would have to try all values of $k$ to actually find the best such cut
after embedding the graph in the real line.
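The procedure of trying all values of $k$ can be sketched as below. This is a straightforward quadratic-time version that recounts the crossing edges for every threshold. The function name `sweep_cut`, the 0-based labels, and the example embedding are assumptions for illustration; the embedding is hand-picked and not a computed eigenvector.

```python
from fractions import Fraction

def sweep_cut(x, edges):
    """Try every threshold cut along the ordering induced by x.

    Returns (best_ratio, best_set), where best_set holds the first k
    vertices in sorted order of x for the k minimizing the cut ratio.
    """
    n = len(x)
    order = sorted(range(n), key=lambda v: x[v])
    best_ratio, best_set = None, None
    S = set()
    for k in range(1, n):
        S.add(order[k - 1])
        crossing = sum(1 for i, j in edges if (i in S) != (j in S))
        ratio = Fraction(crossing, min(k, n - k))
        if best_ratio is None or ratio < best_ratio:
            best_ratio, best_set = ratio, set(S)
    return best_ratio, best_set

# Two triangles joined by the edge (2, 3), with a hypothetical embedding
# that places one triangle on the left and the other on the right.
barbell = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
x = [-3, -3, -1, 1, 3, 3]
ratio, S = sweep_cut(x, barbell)
```

Only $n - 1$ cuts are examined instead of all $2^n$ subsets, which is the point of embedding the graph in the line first.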

We will actually prove something slightly stronger than Cheeger's Inequality: 
Theorem 5 For any $x \perp \mathbf{1}$ with $x_1 \le x_2 \le \dots \le x_n$, there is some $i$ for which

$$\frac{x^T L x}{x^T x} \ge \frac{\phi(\{1, \dots, i\})^2}{2 d_{\max}}.$$

This is great because it not only implies Cheeger's inequality by taking $x = v_2$, but it also gives an actual
cut. It also works even if we have not calculated the exact values for $\lambda_2$ and $v_2$; we just have to get a good
approximation of $v_2$ and we can still get a cut.

4.2 Proof of Cheeger's Inequality 

4.2.1 Step 1: Preprocessing 

First, we are going to do some preprocessing. This step does not reduce the generality of the proof much, 
but it will make the actual proof cleaner. 

• For simplicity, suppose n is odd. 

• Let $m = (n+1)/2$.

• Define the vector $y$ by $y_i = x_i - x_m$.

We can observe that $y_m = 0$, half of the vertices are to the left of $y_m$, and the other half are to the right of
$y_m$.

Claim 6

$$\frac{x^T L x}{x^T x} \ge \frac{y^T L y}{y^T y}.$$

Proof First, the numerators are equal, because the Laplacian quadratic form depends only on differences:

$$x^T L x = \sum_{(i,j) \in E} (x_i - x_j)^2 = \sum_{(i,j) \in E} \big( (y_i + x_m) - (y_j + x_m) \big)^2 = \sum_{(i,j) \in E} (y_i - y_j)^2 = y^T L y.$$

Next, since $x \perp \mathbf{1}$,

$$y^T y = (x - x_m \mathbf{1})^T (x - x_m \mathbf{1}) = x^T x - 2 x_m (x^T \mathbf{1}) + x_m^2 (\mathbf{1}^T \mathbf{1}) = x^T x + n x_m^2 \ge x^T x.$$

Putting together the two computations above yields the desired inequality. ■

4.2.2 Step 2: A Little More Preprocessing 

We do not want edges crossing $y_m = 0$ (because we will later consider the positive and negative vertices
separately), so we replace any edge $(i, j)$ with $i < m < j$ by two edges $(i, m)$ and $(m, j)$. Call this new edge set $E'$.

Claim 7

$$\frac{\sum_{(i,j) \in E} (y_i - y_j)^2}{\sum_{i \in V} y_i^2} \ge \frac{\sum_{(i,j) \in E'} (y_i - y_j)^2}{\sum_{i \in V} y_i^2}.$$

Proof The only difference in the numerator comes from the edges $(i, j)$ that we split into $(i, m)$ and $(m, j)$.
In that case, since $y_i \le y_m = 0 \le y_j$, it is easy to see that

$$(y_j - y_i)^2 \ge (y_j - y_m)^2 + (y_m - y_i)^2.$$
4.2.3 Step 3: Breaking the Sum in Half

We would like to break the summations in half so that we do not have to deal with separate cases with
positive and negative numbers. Let $E'_-$ be the edges $(i, j)$ with $i, j \le m$, and let $E'_+$ be the edges with
$i, j \ge m$. We then have

$$\frac{\sum_{(i,j) \in E'} (y_i - y_j)^2}{\sum_{i} y_i^2} = \frac{\sum_{(i,j) \in E'_-} (y_i - y_j)^2 + \sum_{(i,j) \in E'_+} (y_i - y_j)^2}{\sum_{i=1}^{m} y_i^2 + \sum_{i=m}^{n} y_i^2}.$$

Note that $y_m$ appears twice in the summation in the denominator, which is fine since $y_m = 0$. We also know
that for any $a, b, c, d > 0$,

$$\frac{a + b}{c + d} \ge \min\left( \frac{a}{c}, \frac{b}{d} \right),$$

so it is enough to bound

$$\frac{\sum_{(i,j) \in E'_-} (y_i - y_j)^2}{\sum_{i=1}^{m} y_i^2} \quad \text{and} \quad \frac{\sum_{(i,j) \in E'_+} (y_i - y_j)^2}{\sum_{i=m}^{n} y_i^2}.$$

Since the two values are essentially the same, we will focus only on the first one.
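The elementary inequality used in this step is stated without proof. One short justification, written here with an auxiliary symbol $\mu$ that does not appear in the lecture, is:

```latex
\[
\mu = \min\left( \frac{a}{c}, \frac{b}{d} \right)
\;\Longrightarrow\;
a \ge \mu c \ \text{ and } \ b \ge \mu d
\;\Longrightarrow\;
a + b \ge \mu (c + d)
\;\Longrightarrow\;
\frac{a + b}{c + d} \ge \mu .
\]
```

The first implication uses $c, d > 0$, which is why positivity is assumed.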
4.2.4 The Main Lemma

Let $C_i$ be the number of edges crossing the point $x_i$, i.e. the number of edges in the cut if we were to take
$S = \{1, \dots, i\}$. Recall that

$$\phi = \phi(G) = \min_{S \subset V} \frac{e(S)}{\min(|S|, |\bar{S}|)},$$

so by taking $S = \{1, \dots, i\}$, we get $C_i \ge \phi i$ for $i \le n/2$ and $C_i \ge \phi (n - i)$ for $i \ge n/2$.
The main lemma we use to prove Cheeger's Inequality is as follows.

Lemma 8 (Summation by Parts) For any $z_1 \le \dots \le z_m = 0$,

$$\sum_{(i,j) \in E'_-} |z_i - z_j| \ge \phi \sum_{i=1}^{m} |z_i|.$$

Proof For each $(i, j) \in E'_-$ with $i < j$, write

$$|z_i - z_j| = z_j - z_i = (z_{i+1} - z_i) + (z_{i+2} - z_{i+1}) + \dots + (z_j - z_{j-1}) = \sum_{k=i}^{j-1} (z_{k+1} - z_k).$$

Summing over $(i, j) \in E'_-$, we observe that each term $z_{k+1} - z_k$ appears exactly $C_k$ times. Therefore,

$$\sum_{(i,j) \in E'_-} |z_i - z_j| = \sum_{k=1}^{m-1} C_k (z_{k+1} - z_k) \ge \sum_{k=1}^{m-1} \phi k (z_{k+1} - z_k).$$

Note that $z_i \le z_m = 0$, so $|z_i| = -z_i$ for $1 \le i \le m$. Then we can evaluate the last summation above as

$$\begin{aligned}
\sum_{(i,j) \in E'_-} |z_i - z_j| &\ge \phi \sum_{k=1}^{m-1} k (z_{k+1} - z_k) \\
&= \phi \big( (z_2 - z_1) + 2 (z_3 - z_2) + 3 (z_4 - z_3) + \dots + (m - 1)(z_m - z_{m-1}) \big) \\
&= \phi \big( -z_1 - z_2 - \dots - z_{m-1} + (m - 1) z_m \big) \\
&= \phi \sum_{i=1}^{m} |z_i|.
\end{aligned}$$
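The telescoping step at the end of the proof is easy to get wrong by an index, so here is a small numeric check. It is a sketch with hypothetical helper names, and the list `z` stores $z_1, \dots, z_m$ at positions $0, \dots, m-1$.

```python
def weighted_difference_sum(z):
    """Sum of k * (z_{k+1} - z_k) for k = 1..m-1, with z_1..z_m in z[0..m-1]."""
    m = len(z)
    return sum(k * (z[k] - z[k - 1]) for k in range(1, m))

def sum_of_absolute_values(z):
    """Sum of |z_i| for i = 1..m."""
    return sum(abs(v) for v in z)

# A nondecreasing sequence ending at z_m = 0, as in the lemma.
z = [-5, -3, -3, -1, 0]
assert z == sorted(z) and z[-1] == 0
lhs = weighted_difference_sum(z)   # 1*2 + 2*0 + 3*2 + 4*1
rhs = sum_of_absolute_values(z)    # 5 + 3 + 3 + 1 + 0
```

The two quantities agree whenever the last entry is zero, which is exactly why the preprocessing step shifts the vector so that $y_m = 0$.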
4.2.5 Using the Main Lemma to Prove Cheeger's Inequality

Now we can finally prove Cheeger's inequality.

Proof of Cheeger's Inequality: This proof has five main steps.

1. First, we normalize $y$ such that $\sum_{i=1}^{m} y_i^2 = 1$.

2. Next, in a perhaps somewhat nonintuitive step, we want to get squares into our expression, so
we apply the main lemma (Lemma 8) to a new vector $z$ with $z_i = -y_i^2$. We now have

$$\sum_{(i,j) \in E'_-} |y_i^2 - y_j^2| \ge \phi \sum_{i=1}^{m} |y_i^2| = \phi.$$

3. Next, we want something that looks like $(y_i - y_j)^2$ instead of $y_i^2 - y_j^2$, so we are going to use the
Cauchy-Schwarz inequality.

$$\sum_{(i,j) \in E'_-} |y_i^2 - y_j^2| = \sum_{(i,j) \in E'_-} |y_i - y_j| \cdot |y_i + y_j| \le \Big( \sum_{(i,j) \in E'_-} (y_i - y_j)^2 \Big)^{1/2} \Big( \sum_{(i,j) \in E'_-} (y_i + y_j)^2 \Big)^{1/2}.$$

4. We want to get rid of the $(y_i + y_j)^2$ part, so we bound it and observe that the maximum number of
times any $y_i^2$ can show up in the summation over the edges is the maximum degree of any vertex.

$$\sum_{(i,j) \in E'_-} (y_i + y_j)^2 \le 2 \sum_{(i,j) \in E'_-} (y_i^2 + y_j^2) \le 2 \sum_{i=1}^{m} d_{\max} \cdot y_i^2 = 2 d_{\max}.$$

5. Putting it all together, we get

$$\frac{\sum_{(i,j) \in E'_-} (y_i - y_j)^2}{\sum_{i=1}^{m} y_i^2} \ge \frac{\Big( \sum_{(i,j) \in E'_-} |y_i^2 - y_j^2| \Big)^2}{\sum_{(i,j) \in E'_-} (y_i + y_j)^2} \ge \frac{\phi^2}{2 d_{\max}}.$$

Similarly, we can also show that

$$\frac{\sum_{(i,j) \in E'_+} (y_i - y_j)^2}{\sum_{i=m}^{n} y_i^2} \ge \frac{\phi^2}{2 d_{\max}}.$$

Therefore,

$$\frac{x^T L x}{x^T x} \ge \frac{y^T L y}{y^T y} \ge \min\left\{ \frac{\sum_{(i,j) \in E'_-} (y_i - y_j)^2}{\sum_{i=1}^{m} y_i^2}, \frac{\sum_{(i,j) \in E'_+} (y_i - y_j)^2}{\sum_{i=m}^{n} y_i^2} \right\} \ge \frac{\phi^2}{2 d_{\max}}.$$
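As a sanity check on the final statement, the sketch below tests both sides of Cheeger's Inequality on small path graphs. It assumes the standard closed form $\lambda_2 = 2 - 2\cos(\pi/n)$ for the path on $n$ vertices and computes $\phi(G)$ by brute force, so it is only practical for tiny $n$. The function names are hypothetical.

```python
import math
from itertools import combinations

def brute_force_phi(n, edges):
    """Minimum of e(S) / min(|S|, n - |S|) over nonempty proper subsets."""
    best = math.inf
    for size in range(1, n // 2 + 1):
        for S in map(set, combinations(range(n), size)):
            crossing = sum(1 for i, j in edges if (i in S) != (j in S))
            # size <= n / 2, so min(|S|, n - |S|) is simply size.
            best = min(best, crossing / size)
    return best

def cheeger_holds_on_path(n):
    """Check phi^2 / (2 d_max) <= lambda_2 <= 2 phi on the n-vertex path."""
    edges = [(i, i + 1) for i in range(n - 1)]
    phi = brute_force_phi(n, edges)
    lambda2 = 2 - 2 * math.cos(math.pi / n)  # closed form (assumed)
    d_max = 2  # valid for paths with at least 3 vertices
    return phi ** 2 / (2 * d_max) <= lambda2 <= 2 * phi
```

On paths the lower bound is the one that is close to tight, in line with the earlier remark that $\lambda_2 = O(1/n^2)$ while $\phi = O(1/n)$.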



4.2.6 So who is Cheeger anyway? 

Jeff Cheeger is a differential geometer. His inequality makes a lot more sense in the continuous world, and his 
motivation was in differential geometry. This was part of his PhD thesis, and he was actually investigating 
heat kernels on smooth manifolds. A heat kernel can also be thought of as a point of heat in space, and the 
question is the speed at which the heat spreads. It can also be thought of as the mixing time of a random 
walk, which will be discussed in future lectures. 